Application Domains

The typical IT projects to which our technologies contribute aim at efficient and flexible management of complex digital information. The form and nature of the data often vary: Web pages, Office or PDF documents, XML structured data (sometimes obtained through Web service gateways), thesauri, ontologies, etc. From such heterogeneous, complex resources, interested parties seek to build tools for the efficient storage, classification, annotation, enrichment, and fine-grained search of such data. (Interesting areas of content management that we do not address include audio and video data, natural language processing, data mining, access control, and privacy. We collaborate with other groups specialized in these topics.) Sample real-life applications that we have already worked on in this setting are:

  • Archiving filtered content from online information sources (journals, blogs, ...) in order to record their perspective on facts involving specific countries, key political actors, etc. (EADS data gathering for intelligence purposes, also an application from the WebContent project).

  • Building an XML data warehouse out of public e-mails exchanged in a technical standardization body (in our particular case, the W3C) in order to enable a fine-grained social network analysis to determine key players, opinion leaders etc.

  • Building a complete processing chain for digital documents from the medical domain. The process may start with digitization and text extraction from scanned documents (we do not work in this area), then continue with the extraction of named entities, document annotation based on existing domain ontologies, mapping of documents to a central domain ontology, reasoning across scattered data sources for query answering, and storing, indexing, and distributing the data (and query results) across distributed players.

  • Integrating, analyzing, and combining the data produced and made public by numerous public administration offices (in France, Europe, and the world) into added-value information sources. Time is also an essential dimension here; so are data matching and reconciliation, since the same entity may be referenced from many different viewpoints and must be reconciled when data sources are joined. Users of such applications could be public administrations analyzing the impact of their policies, or social scientists and journalists who already work on the data (but gather it with much difficulty), etc. This application stems from our collaboration with the DataPublica start-up.
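
To make the reconciliation issue raised in the last application more concrete, here is a minimal Python sketch of joining two open-data sources whose records name the same entity differently. It is only an illustration under stated assumptions, not the method used in any of the projects above: the function names (`normalize`, `reconcile_and_join`), the field names, and the sample records are all hypothetical, and matching on a normalized name is a deliberately crude stand-in for real entity reconciliation.

```python
import unicodedata
from collections import defaultdict


def normalize(name: str) -> str:
    """Build a crude matching key: drop accents, case, and punctuation."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_accents.lower())
    return " ".join(cleaned.split())


def reconcile_and_join(left, right, key="name"):
    """Join two lists of dict records whose `key` values normalize identically."""
    index = defaultdict(list)
    for record in right:
        index[normalize(record[key])].append(record)

    joined = []
    for record in left:
        for match in index.get(normalize(record[key]), []):
            # Fields from `left` take precedence when both sides define them.
            joined.append({**match, **record})
    return joined


# Hypothetical records with made-up values, for illustration only.
source_a = [{"name": "Île-de-France", "indicator_a": 10}]
source_b = [{"name": "ILE DE FRANCE", "indicator_b": 100}]

rows = reconcile_and_join(source_a, source_b)
```

Here both spellings reduce to the same key, so the two records are merged into a single row. A realistic setting would also have to handle the time dimension and genuinely ambiguous names, which this sketch ignores.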